{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/track-and-monitor-experiments/logging-api/logging-api.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Logging\n",
        "\n",
        "_**This notebook showcases various ways to use the Azure Machine Learning service run logging APIs, and view the results in the Azure portal.**_\n",
        "\n",
        "---\n",
        "---\n",
        "\n",
        "## Table of Contents\n",
        "\n",
        "1. [Introduction](#Introduction)\n",
        "1. [Setup](#Setup)\n",
        "    1. Validate Azure ML SDK installation\n",
        "    1. Initialize workspace\n",
        "    1. Set experiment\n",
        "1. [Logging](#Logging)\n",
        "    1. Starting a run\n",
        "        1. Viewing a run in the portal\n",
        "        1. Viewing the experiment in the portal\n",
        "    1. Logging metrics\n",
        "        1. Logging string metrics\n",
        "        1. Logging numeric metrics\n",
        "        1. Logging vectors\n",
        "        1. Logging tables\n",
        "        1. Uploading files\n",
        "1. [Analyzing results](#Analyzing-results)\n",
        "    1. Tagging a run\n",
        "1. [Next steps](#Next-steps)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Introduction\n",
        "\n",
        "Logging metrics from runs in your experiments allows you to track results from one run to another, determining trends in your outputs and understand how your inputs correspond to your model and script performance.  Azure Machine Learning services (AzureML) allows you to track various types of metrics including images and arbitrary files in order to understand, analyze, and audit your experimental progress. \n",
        "\n",
        "Typically you should log all parameters for your experiment and all numerical and string outputs of your experiment.  This will allow you to analyze the performance of your experiments across multiple runs, correlate inputs to outputs, and filter runs based on interesting criteria.\n",
        "\n",
        "The experiment's Run History report page automatically creates a report that can be customized to show the KPI's, charts, and column sets that are interesting to you. \n",
        "\n",
        "| ![Run Details](./img/run_details.PNG) | ![Run History](./img/run_history.PNG) |\n",
        "|:--:|:--:|\n",
        "| *Run Details* | *Run History* |\n",
        "\n",
        "---\n",
        "\n",
        "## Setup\n",
        "\n",
        "If you are using an Azure Machine Learning Notebook VM, you are all set.  Otherwise, go through the [configuration](../../../configuration.ipynb) Notebook first if you haven't already to establish your connection to the AzureML Workspace. Also make sure you have tqdm and matplotlib installed in the current kernel.\n",
        "\n",
        "```\n",
        "(myenv) $ conda install -y tqdm matplotlib\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Validate Azure ML SDK installation and get version number for debugging purposes"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "install"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core import Experiment, Workspace, Run\n",
        "import azureml.core\n",
        "import numpy as np\n",
        "from tqdm import tqdm\n",
        "\n",
        "# Check core SDK version number\n",
        "\n",
        "print(\"This notebook was created using SDK version 1.1.0rc0, you are currently running version\", azureml.core.VERSION)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Initialize workspace\n",
        "\n",
        "Initialize a workspace object from persisted configuration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create workspace"
        ]
      },
      "outputs": [],
      "source": [
        "ws = Workspace.from_config()\n",
        "print('Workspace name: ' + ws.name, \n",
        "      'Azure region: ' + ws.location, \n",
        "      'Subscription id: ' + ws.subscription_id, \n",
        "      'Resource group: ' + ws.resource_group, sep='\\n')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Set experiment\n",
        "Create a new experiment (or get the one with the specified name).  An *experiment* is a container for an arbitrary set of *runs*. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "experiment = Experiment(workspace=ws, name='logging-api-test')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "---\n",
        "\n",
        "## Logging\n",
        "In this section we will explore the various logging mechanisms.\n",
        "\n",
        "### Starting a run\n",
        "\n",
        "A *run* is a singular experimental trial.  In this notebook we will create a run directly on the experiment  by calling `run = exp.start_logging()`.  If you were experimenting by submitting a script file as an experiment using ``experiment.submit()``, you would call `run = Run.get_context()` in your script to access the run context of your code.  In either case, the logging methods on the returned run object work the same.\n",
        "\n",
        "This cell also stores the run id for use later in this notebook.  The run_id is not necessary for logging."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# start logging for the run\n",
        "run = experiment.start_logging()\n",
        "\n",
        "# access the run id for use later\n",
        "run_id = run.id\n",
        "\n",
        "# change the scale factor on different runs to see how you can compare multiple runs\n",
        "scale_factor = 2\n",
        "\n",
        "# change the category on different runs to see how to organize data in reports\n",
        "category = 'Red'"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Viewing a run in the Portal\n",
        "Once a run is started you can see the run in the portal by simply typing ``run``.  Clicking on the \"Link to Portal\" link will take you to the Run Details page that shows the metrics you have logged and other run properties.  You can refresh this page after each logging statement to see the updated results."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "run"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Viewing an experiment in the portal\n",
        "You can also view an experiement similarly by typing `experiment`.  The portal link will take you to the experiment's Run History page that shows all runs and allows you to analyze trends across multiple runs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "experiment"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Logging metrics\n",
        "Metrics are visible in the run details page in the AzureML portal and also can be analyzed in experiment reports.  The run details page looks as below and contains tabs for Details, Outputs, Logs, and Snapshot.  \n",
        "* The Details page displays attributes about the run, plus logged metrics and images.  Metrics that are vectors appear as charts.  \n",
        "* The Outputs page contains any files, such as models, you uploaded into the \"outputs\" directory from your run into storage.  If you place files in the \"outputs\" directory locally, the files are automatically uploaded on your behald when the run is completed.\n",
        "* The Logs page allows you to view any log files created by your run.  Logging runs created in notebooks typically do not generate log files.\n",
        "* The Snapshot page contains a snapshot of the directory specified in the ''start_logging'' statement, plus the notebook at the time of the ''start_logging'' call.  This snapshot and notebook can be downloaded from the Run Details page to continue or reproduce an experiment.\n",
        "\n",
        "### Logging string metrics\n",
        "The following cell logs a string metric.  A string metric is simply a string value associated with a name.  A string metric String metrics are useful for labelling runs and to organize your data.  Typically you should log all string parameters as metrics for later analysis - even information such as paths can help to understand how individual experiements perform differently.\n",
        "\n",
        "String metrics can be used in the following ways:\n",
        "* Plot in hitograms\n",
        "* Group by indicators for numerical plots\n",
        "* Filtering runs\n",
        "\n",
        "String metrics appear in the **Tracked Metrics** section of the Run Details page and can be added as a column in Run History reports."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# log a string metric\n",
        "run.log(name='Category', value=category)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Logging numerical metrics\n",
        "The following cell logs some numerical metrics.  Numerical metrics can include metrics such as AUC or MSE.  You should log any parameter or significant output measure in order to understand trends across multiple experiments.  Numerical metrics appear in the **Tracked Metrics** section of the Run Details page, and can be used in charts or KPI's in experiment Run History reports."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# log numerical values\n",
        "run.log(name=\"scale factor\", value = scale_factor)\n",
        "run.log(name='Magic Number', value=42 * scale_factor)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Logging vectors\n",
        "Vectors are good for recording information such as loss curves. You can log a vector by creating a list of numbers, calling ``log_list()`` and supplying a name and the list, or by repeatedly logging a value using the same name.\n",
        "\n",
        "Vectors are presented in Run Details as a chart, and are directly comparable in experiment reports when placed in a chart. \n",
        "\n",
        "**Note:** vectors logged into the run are expected to be relatively small. Logging very large vectors into Azure ML can result in reduced performance. If you need to store large amounts of data associated with the run, you can write the data to file that will be uploaded."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fibonacci_values = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]\n",
        "scaled_values = (i * scale_factor for i in fibonacci_values)\n",
        "\n",
        "# Log a list of values. Note this will generate a single-variable line chart.\n",
        "run.log_list(name='Fibonacci', value=scaled_values)\n",
        "\n",
        "for i in tqdm(range(-10, 10)):\n",
        "    # log a metric value repeatedly, this will generate a single-variable line chart.\n",
        "    run.log(name='Sigmoid', value=1 / (1 + np.exp(-i)))\n",
        "    "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Logging tables\n",
        "Tables are good for recording related sets of information such as accuracy tables, confusion matrices, etc.  \n",
        "You can log a table in two ways:\n",
        "* Create a dictionary of lists where each list represents a column in the table and call ``log_table()``\n",
        "* Repeatedly call ``log_row()`` providing the same table name with a consistent set of named args as the column values\n",
        "\n",
        "Tables are presented in Run Details as a chart using the first two columns of the table  \n",
        "\n",
        "**Note:** tables logged into the run are expected to be relatively small.  Logging very large tables into Azure ML can result in reduced performance.  If you need to store large amounts of data associated with the run, you can write the data to file that will be uploaded."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# create a dictionary to hold a table of values\n",
        "sines = {}\n",
        "sines['angle'] = []\n",
        "sines['sine'] = []\n",
        "\n",
        "for i in tqdm(range(-10, 10)):\n",
        "    angle = i / 2.0 * scale_factor\n",
        "    \n",
        "    # log a 2 (or more) values as a metric repeatedly. This will generate a 2-variable line chart if you have 2 numerical columns.\n",
        "    run.log_row(name='Cosine Wave', angle=angle, cos=np.cos(angle))\n",
        "        \n",
        "    sines['angle'].append(angle)\n",
        "    sines['sine'].append(np.sin(angle))\n",
        "\n",
        "# log a dictionary as a table, this will generate a 2-variable chart if you have 2 numerical columns\n",
        "run.log_table(name='Sine Wave', value=sines)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Logging images\n",
        "You can directly log _matplotlib_ plots and arbitrary images to your run record.  This code logs a _matplotlib_ pyplot object.  Images show up in the run details page in the Azure ML Portal."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%matplotlib inline\n",
        "\n",
        "# Create a plot\n",
        "import matplotlib.pyplot as plt\n",
        "angle = np.linspace(-3, 3, 50) * scale_factor\n",
        "plt.plot(angle,np.tanh(angle), label='tanh')\n",
        "plt.legend(fontsize=12)\n",
        "plt.title('Hyperbolic Tangent', fontsize=16)\n",
        "plt.grid(True)\n",
        "\n",
        "# Log the plot to the run.  To log an arbitrary image, use the form run.log_image(name, path='./image_path.png')\n",
        "run.log_image(name='Hyperbolic Tangent', plot=plt)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Uploading files\n",
        "\n",
        "Files can also be uploaded explicitly and stored as artifacts along with the run record. These files are also visible in the *Outputs* tab of the Run Details page.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "file_name = 'outputs/myfile.txt'\n",
        "\n",
        "with open(file_name, \"w\") as f:\n",
        "    f.write('This is an output file that will be uploaded.\\n')\n",
        "\n",
        "# Upload the file explicitly into artifacts \n",
        "run.upload_file(name = file_name, path_or_stream = file_name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Completing the run\n",
        "\n",
        "Calling `run.complete()` marks the run as completed and triggers the output file collection.  If for any reason you need to indicate the run failed or simply need to cancel the run you can call `run.fail()` or `run.cancel()`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "run.complete()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "---\n",
        "\n",
        "## Analyzing results"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can refresh the run in the Azure portal to see all of your results.  In many cases you will want to analyze runs that were performed previously to inspect the contents or compare results.  Runs can be fetched from their parent Experiment object using the ``Run()`` constructor or the ``experiment.get_runs()`` method. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fetched_run = Run(experiment, run_id)\n",
        "fetched_run"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Call ``run.get_metrics()`` to retrieve all the metrics from a run."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fetched_run.get_metrics()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Call ``run.get_metrics(name = <metric name>)`` to retrieve a metric value by name. Retrieving a single metric can be faster, especially if the run contains many metrics."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fetched_run.get_metrics(name = \"scale factor\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "See the files uploaded for this run by calling ``run.get_file_names()``"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fetched_run.get_file_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Once you know the file names in a run, you can download the files using the ``run.download_file()`` method"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "os.makedirs('files', exist_ok=True)\n",
        "\n",
        "for f in run.get_file_names():\n",
        "    dest = os.path.join('files', f.split('/')[-1])\n",
        "    print('Downloading file {} to {}...'.format(f, dest))\n",
        "    fetched_run.download_file(f, dest)   "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Tagging a run\n",
        "Often when you analyze the results of a run, you may need to tag that run with important personal or external information.  You can add a tag to a run using the ``run.tag()`` method.  AzureML supports valueless and valued tags."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fetched_run.tag(\"My Favorite Run\")\n",
        "fetched_run.tag(\"Competition Rank\", 1)\n",
        "\n",
        "fetched_run.get_tags()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Next steps\n",
        "To experiment more with logging and to understand how metrics can be visualized, go back to the *Start a run* section, try changing the category and scale_factor values and going through the notebook several times.  Play with the KPI, charting, and column selection options on the experiment's Run History reports page to see how the various metrics can be combined and visualized.\n",
        "\n",
        "After learning about all of the logging options, go to the [train on remote vm](..\\train-on-remote-vm\\train-on-remote-vm.ipynb) notebook and experiment with logging from remote compute contexts."
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "roastala"
      }
    ],
    "category": "other",
    "compute": [
      "None"
    ],
    "datasets": [
      "None"
    ],
    "deployment": [
      "None"
    ],
    "exclude_from_index": false,
    "framework": [
      "None"
    ],
    "friendly_name": "Logging APIs",
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.8"
    },
    "order_index": 1,
    "star_tag": [],
    "tags": [
      "None"
    ],
    "task": "Logging APIs and analyzing results"
  },
  "nbformat": 4,
  "nbformat_minor": 2
}